Rank | Count | Beginning |
---|---|---|
26935 | 12018 | Die |
21330 | 5106 | Der |
15004 | 4928 | Das |
62949 | 3399 | In |
49495 | 2738 | Es |
80224 | 2684 | Sie |
60507 | 2363 | Im |
95503 | 1765 | Wir |
47203 | 1757 | Er |
58208 | 1705 | Ich |
8236 | 1562 | Bei |
36217 | 1518 | Diese |
43749 | 1340 | Ein |
92950 | 1311 | Wenn |
72233 | 1159 | Mit |
44095 | 1138 | Eine |
73731 | 929 | Nach |
1849 | 896 | Als |
53585 | 883 | Für |
5892 | 870 | Auf |
5073 | 814 | Auch |
2901 | 789 | Am |
83096 | 716 | So |
70451 | 710 | Man |
86827 | 636 | Und |
35887 | 579 | Dies |
94817 | 578 | Wie |
37370 | 574 | Dieser |
57070 | 556 | Hier |
129 | 548 | Aber |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
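The same count can be sketched outside the database. The following is a minimal Python illustration, not the tooling used to produce the table above: the function name `sentence_beginnings` and the four sample sentences are made up for the example. It assumes that sentences are plain strings with words separated by single spaces, as the query does.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper that mirrors the query above: split on a single
    space, keep the first n pieces, count, and sort by descending count.
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.split(" ")
        # A sentence with fewer than n words is counted as a whole,
        # which is also what substring_index does in that case.
        counts[" ".join(words[:n])] += 1
    return counts.most_common(limit)


# Illustrative sample only; not taken from the corpus.
sample = [
    "Die Sonne scheint.",
    "Der Hund bellt.",
    "Die Katze schläft.",
    "Es regnet.",
]

top = sentence_beginnings(sample, n=1)
for beginning, count in top:
    print(count, beginning)
```

Calling the function with `n=2`, `n=3` or `n=4` gives the longer beginnings covered in the following subsections.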